“Making effective use of metadata of historical texts and corpora”, 7-8 September 2017

Authors

  • Jörg Knappen
  • Katrin Menzel
  • Peter Fankhauser
  • Stefania Degaetano-Ortlieb
  • Julie Weeds
  • Justyna Robinson
Abstract

tba

Peter Fankhauser (Institut für Deutsche Sprache, Mannheim), Visual correlation for exploring paradigmatic language change

Abstract: Paradigmatic language change occurs when paradigmatically related words with similar usage contexts rise or fall together. We introduce an approach to explore such paradigmatic change in diachronic corpora by visually correlating two factors: frequency change and the distributional semantics of words. Frequency change is visualized by means of color derived from the slope of a logistic growth curve fitted to the frequency trend. The semantics of words is visualized by positioning them in two dimensions such that words with similar usage contexts are positioned close together. As a result we get islands of paradigmatically related words with similar color that can act as a guide for exploring language change.

Louisiane Ferlier (Royal Society, London), The Royal Society Journal Collection: unlocking 300 years of scientific periodicals
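The frequency-change factor described in the Fankhauser abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the logistic curve is fitted by ordinary least squares on logit-transformed relative frequencies, and the function names `logistic_slope` and `slope_to_rgb`, the `max_abs` scaling, and the blue-to-red palette are all hypothetical choices made for illustration.

```python
import math


def logistic_slope(years, rel_freqs, eps=1e-9):
    """Estimate the growth rate b of p(t) = 1 / (1 + exp(-(a + b*t))).

    Assumption: the fit is done by least squares on the logit scale,
    where a logistic curve becomes the straight line a + b*t.
    """
    # eps guards against log(0) for words absent in some period
    logits = [math.log((p + eps) / (1.0 - p + eps)) for p in rel_freqs]
    n = len(years)
    mean_t = sum(years) / n
    mean_y = sum(logits) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, logits))
    var = sum((t - mean_t) ** 2 for t in years)
    return cov / var


def slope_to_rgb(slope, max_abs=0.05):
    """Map a slope to a diverging color (hypothetical palette).

    Falling words shade toward blue, rising words toward red,
    and stable words stay white. Slopes beyond max_abs are clipped.
    """
    x = max(-1.0, min(1.0, slope / max_abs))
    if x >= 0:
        fade = round(255 * (1 - x))
        return (255, fade, fade)
    fade = round(255 * (1 + x))
    return (fade, fade, 255)


if __name__ == "__main__":
    # Made-up relative frequencies for one word across four periods
    years = [1700, 1750, 1800, 1850]
    rising = [0.001, 0.002, 0.004, 0.008]
    print(slope_to_rgb(logistic_slope(years, rising)))
```

Each word would then be drawn at its two-dimensional position (obtained separately from a distributional model) in the color returned by `slope_to_rgb`, so that neighboring words with similar colors form the islands the abstract mentions.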

Similar resources

Metadiscourse Use in Popular and Professional Science: The Case of Hedges and Boosters

The present article shows that all scientific texts included in journals, magazines, and newspapers are vulnerable to the penetration of hedges and boosters. However, it was found that scientific texts in the three corpora tended to open up the possibilities of alternative voices rather than narrowing them down. The relatively higher frequency of occurrence of hedges in comparison with booster...


Compiling and Processing Historical and Contemporary Portuguese Corpora

University of Cologne, Albertus-Magnus Platz, 50923 Cologne, Germany. Abstract: This technical report describes the framework used for processing three large Portuguese corpora. Two corpora contain texts from newspapers, one published in Brazil and the other published in Portugal. The third corpus is Colonia, a historical Portuguese collection containing texts written...


Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in Russian Unified State Exam on English language. Texts that formed small research corpora were retrieved from 2 resources: official USE database as a reference point, and popular website used by pupils for USE training “Neznaika” (https://neznaika.pro/). The size of two corpora is balanced: USE has 11934 tokens and “Neznaika” - 11918 tokens. We share Biber’...


Comparative Study of the Academic Vocabulary Content of Electronic Engineering Corpora, GE Materials and M.S. Entrance Examinations

The importance of vocabulary learning has been underlined in the field of English for Academic Purposes (EAP) because non-English majors who require reading English texts in their fields of study have to expand their English vocabulary knowledge much more efficiently than ordinary ESL/EFL learners. Since academic vocabulary instruction in Iranian universities is realized through the use of Gene...


Building a Corpus-based Historical Portuguese Dictionary: Challenges and Opportunities

Historical corpora are important resources for different areas. Philology, Human Language Technology, Literary Studies, History, and Lexicography are some that benefit from them. However, compiling historical corpora is different from compiling contemporary corpora. Corpus designers have to deal with several characteristics inherent in historical texts, such as: absence of a spelling standard, ...



Publication date: 2017